Direct Execution

Often when we virtualize something we add a layer of indirection.
- That layer affects performance
- Can’t afford to do that in this case (can’t afford to turn every instruction into two or more)
Limited Direct Execution
- Run the program directly
  - i.e., no interpretation / virtualization
  - But, limit what the program can do.
Restricted operations:
- What should a user-provided process (e.g., C code an ordinary user writes and compiles) not be allowed to do?
- I/O (could ignore / skip permissions checks)
- Grant itself more resources (CPU time, disk quota, etc.)
- Create new users
- Change other people’s passwords.

Kernel and User mode

Most CPUs have a feature whereby they can be set to a privilege level.
We will primarily refer to two levels: kernel mode and user mode.
- user mode is for “ordinary” processes.
- kernel mode is for the operating system (and other trusted processes).
Some hardware instructions (e.g., I/O instructions) can only be executed while the CPU is in kernel mode.
On x86,
- “kernel mode” is also known as “Ring 0”.
- “user mode” is “Ring 3”.
- Most modern operating systems do not use Rings 1 and 2.
Certain memory addresses are only accessible while in Ring 0.
So, when your program is running, there are two sets of instructions in memory:
1. The instructions for your code (plus any libraries you use)
2. The instructions for the operating system: code you call when you want the OS to do something for you (e.g., I/O).
Show a memory map with separate user code and kernel code.
Walk through the boot process.

Reserve certain memory addresses for OS code (e.g., above 0x7FFF_FFFF).
Instructions located at these addresses are privileged.
Call OS code with a standard jal.
What are some ways that this system can fail?
- selectively jumping to the middle of a kernel routine can skip checks.
- Could also trash kernel data, thereby harming other users.
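The first failure can be made concrete with a toy model. This is only a sketch under simplifying assumptions: one "instruction" per address, a three-step kernel routine, and the names `toy_jal`, `toy_kernel_write`, and `KERNEL_BASE` are all hypothetical. It shows that if privilege depends only on where an instruction lives, a caller who jumps one instruction past the entry point gets the privileged work without the permission check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Boundary taken from the notes: addresses above 0x7FFFFFFF hold OS code. */
#define KERNEL_BASE 0x80000000u

/* A toy kernel routine, one "instruction" per address (a simplification). */
enum toy_op { OP_CHECK_PERMISSION, OP_DO_IO, OP_RETURN };
static const enum toy_op toy_kernel_write[] = {
    OP_CHECK_PERMISSION, OP_DO_IO, OP_RETURN
};
#define TOY_ROUTINE_LEN (sizeof toy_kernel_write / sizeof toy_kernel_write[0])

struct toy_result { bool check_ran; bool io_done; };

/* Simulate "jal target": start executing at whatever address the caller
 * chose. Nothing forces the caller to pick the first instruction. */
static struct toy_result toy_jal(uint32_t target, bool caller_has_permission)
{
    struct toy_result r = { false, false };
    if (target < KERNEL_BASE)
        return r;                      /* not OS code: nothing privileged runs */
    for (size_t i = target - KERNEL_BASE; i < TOY_ROUTINE_LEN; i++) {
        switch (toy_kernel_write[i]) {
        case OP_CHECK_PERMISSION:
            r.check_ran = true;
            if (!caller_has_permission)
                return r;              /* request refused */
            break;
        case OP_DO_IO:
            r.io_done = true;          /* the privileged work */
            break;
        case OP_RETURN:
            return r;
        }
    }
    return r;
}
```

Calling `toy_jal(KERNEL_BASE, false)` runs the check and refuses the I/O, while `toy_jal(KERNEL_BASE + 1, false)` performs the I/O with no check at all.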
Suggest other ideas.

There is a special instruction called a trap that
1. jumps to the beginning of an OS routine
2. places the CPU in “kernel mode”.
How do we define where these routines begin?
- A trap table.
- The trap instruction looks in a special register for the location (memory address) of the handler.
- This could be a single register with a memory address (This is what MIPS does)
- This could be a table of handler addresses where a parameter tells us which entry to run (this is roughly what Intel does: on x86 the table lives in memory and a register holds its location)
In either case, the handler routine gathers any additional data from registers, then runs the OS code in privileged mode.
At the completion of the system call a special instruction runs to
1. Put the CPU back in “user mode”
2. Return from the system call.
The first thing a handler must do is save register values (and any other data) that might get clobbered by the handler. Some kernels have their own stack to store this data.
How does this special register (or register array) get initialized?
- When you boot up the CPU, it boots in privileged mode.
- The boot code loads the OS code (including the system call code) into the appropriate area of memory and also sets up the trap table.
- The CPU is switched to user mode before your first program starts running.
This mechanism is not just for privilege changing. This same interrupt mechanism is used for things like
1. Hardware interrupts (disk drive, network card, etc.) requesting service.
2. Exceptions (divide-by-zero errors, invalid instructions, etc.)
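The trap mechanism described in this section can be sketched as a small simulation. This is not how any real CPU or kernel is written; `toy_cpu`, `toy_boot`, `toy_trap`, and the two handlers are hypothetical names chosen for illustration. The sketch assumes a table-style design: boot code fills the table while in kernel mode, user code can only enter the kernel through a table entry, and the mode is switched back on return.

```c
#include <stddef.h>

enum cpu_mode { USER_MODE, KERNEL_MODE };

struct toy_cpu;
typedef long (*trap_handler)(struct toy_cpu *cpu, long arg);

#define NUM_TRAPS 4

struct toy_cpu {
    enum cpu_mode mode;
    trap_handler trap_table[NUM_TRAPS];   /* stands in for the trap table */
};

/* Installing a handler is a privileged operation: it fails in user mode. */
static int install_handler(struct toy_cpu *cpu, int n, trap_handler h)
{
    if (cpu->mode != KERNEL_MODE || n < 0 || n >= NUM_TRAPS)
        return -1;
    cpu->trap_table[n] = h;
    return 0;
}

/* Hypothetical handler 0: reports whether it is running in kernel mode. */
static long sys_in_kernel_mode(struct toy_cpu *cpu, long arg)
{
    (void)arg;
    return cpu->mode == KERNEL_MODE;
}

/* Hypothetical handler 1: a stand-in for real OS work. */
static long sys_double(struct toy_cpu *cpu, long arg)
{
    (void)cpu;
    return 2 * arg;
}

/* Boot: start privileged, fill the table, then drop to user mode. */
static void toy_boot(struct toy_cpu *cpu)
{
    cpu->mode = KERNEL_MODE;
    for (int i = 0; i < NUM_TRAPS; i++)
        cpu->trap_table[i] = NULL;
    install_handler(cpu, 0, sys_in_kernel_mode);
    install_handler(cpu, 1, sys_double);
    cpu->mode = USER_MODE;
}

/* The trap "instruction": enter the kernel only at a table entry,
 * raise the privilege level, and lower it again on the way out. */
static long toy_trap(struct toy_cpu *cpu, int n, long arg)
{
    if (n < 0 || n >= NUM_TRAPS || cpu->trap_table[n] == NULL)
        return -1;
    cpu->mode = KERNEL_MODE;
    long result = cpu->trap_table[n](cpu, arg);
    cpu->mode = USER_MODE;                /* return-from-trap */
    return result;
}
```

After `toy_boot(&cpu)`, a call such as `toy_trap(&cpu, 1, 21)` runs `sys_double` in kernel mode and returns with the CPU back in user mode. Because the caller supplies only an index, it cannot jump into the middle of a handler.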

When programming in C, system calls look like library functions.
That’s because they are.
When you want to make a read system call, you actually call a user-space function named read from the standard library.
This function, while running in user space, sets up the registers as needed, then executes the trap instruction. This switches the CPU to kernel mode and runs the privileged kernel code that does the read.
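To see the wrapper and the trap side by side, here is a minimal sketch that assumes a Linux system with glibc, where `syscall()` and `SYS_write` are available. It uses `write` rather than `read` only because the effect is easier to observe. The helper names `write_via_wrapper` and `write_via_raw_syscall` are made up for this example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The usual route: call the library wrapper, which sets up the
 * registers and executes the trap instruction for us. */
static long write_via_wrapper(int fd, const char *buf, size_t len)
{
    return (long)write(fd, buf, len);
}

/* The same request made through the generic syscall() entry point,
 * naming the system call by number (SYS_write). Assumes Linux. */
static long write_via_raw_syscall(int fd, const char *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Both helpers end up at the same kernel code. For example, `write_via_wrapper(1, "hi\n", 3)` and `write_via_raw_syscall(1, "hi\n", 3)` each write three bytes to standard output.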

How does the OS regain control of the CPU from a running process?
Option 1: Cooperative
- Wait for process to make a yield system call.
How can this go wrong?
1. A process maliciously refuses to yield.
2. A process gets stuck in an infinite loop.
What else can we do?
- Set up a timer that periodically transfers control to the kernel. The kernel can then check the time and switch processes when necessary.
The timer is a mechanism.
The corresponding policy includes
- How often should the timer go off?
- Does the timer reset when a program voluntarily yields?
- Should the current process be switched?
- Which process should run next?
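The difference between the cooperative approach and the timer can be illustrated with a small deterministic simulation. It is a sketch, not a real scheduler: time advances in abstract ticks, and `toy_proc`, `toy_pick_next`, and `toy_run` are hypothetical names. The mechanism is the `timer_fired` test; the policy choices assumed here are a fixed quantum, round-robin selection, and a timer that resets whenever a switch happens, including after a voluntary yield.

```c
#include <stdbool.h>

#define NPROC 2

struct toy_proc {
    bool yields;     /* true: yields after every tick; false: never yields */
    int ticks_run;   /* how many ticks this process has received */
};

/* Policy: which process runs next? Here, simple round robin. */
static int toy_pick_next(int current)
{
    return (current + 1) % NPROC;
}

/* Run the system for total_ticks. With use_timer false the kernel only
 * regains control when the running process yields (cooperative). */
static void toy_run(struct toy_proc procs[NPROC], int total_ticks,
                    bool use_timer, int quantum)
{
    int current = 0;
    int since_switch = 0;
    for (int t = 0; t < total_ticks; t++) {
        procs[current].ticks_run++;
        since_switch++;
        bool yielded = procs[current].yields;
        bool timer_fired = use_timer && since_switch >= quantum;  /* mechanism */
        if (yielded || timer_fired) {
            current = toy_pick_next(current);                     /* policy */
            since_switch = 0;   /* policy: the timer resets on every switch */
        }
    }
}
```

With process 0 never yielding and process 1 always yielding, 100 ticks without the timer give process 0 all 100 ticks. With the timer and a quantum of 10, process 1 gets a turn after every 10 ticks of process 0.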
Notice that registers may be saved twice:
1. The interrupt causes registers to be saved.
  - This may be done automatically by the hardware.
  - Saved on the kernel stack
2. If a context switch is made, then the registers are also saved in the process’s in-kernel data structure.
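The two saves can be sketched with a pair of structures. This is a simplified model with hypothetical names (`toy_regs`, `toy_proc`, `toy_on_interrupt`, `toy_context_switch`, `toy_return_from_trap`) and a tiny register set. Real kernels lay these out differently and do the work in assembly.

```c
#include <stdint.h>

#define NREGS 4

struct toy_regs {
    uint32_t r[NREGS];
    uint32_t pc;
};

struct toy_proc {
    struct toy_regs trap_frame;  /* save 1: user registers, kept on this
                                    process's kernel stack */
    struct toy_regs context;     /* save 2: kernel registers, kept in the
                                    process's in-kernel data structure */
};

/* Save 1: on an interrupt, the user-mode registers are saved. */
static void toy_on_interrupt(const struct toy_regs *cpu, struct toy_proc *p)
{
    p->trap_frame = *cpu;
}

/* Save 2: only if the kernel decides to switch processes. The current
 * (kernel-mode) registers go into `from`, and `to` is resumed. */
static void toy_context_switch(struct toy_regs *cpu, struct toy_proc *from,
                               const struct toy_proc *to)
{
    from->context = *cpu;
    *cpu = to->context;
}

/* Return-from-trap: restore the user registers of whichever process
 * is now current. */
static void toy_return_from_trap(struct toy_regs *cpu, const struct toy_proc *p)
{
    *cpu = p->trap_frame;
}
```

If process A is interrupted and the kernel switches to process B, A's user registers sit in `A.trap_frame`, A's kernel registers sit in `A.context`, and `toy_return_from_trap` resumes B from `B.trap_frame`.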